Joint UD Parsing of Norwegian Bokmål and Nynorsk

Authors

  • Erik Velldal
  • Lilja Øvrelid
  • Petter Hohle
Abstract

This paper investigates interactions in parser performance for the two official standards for written Norwegian: Bokmål and Nynorsk. We demonstrate that while applying models across standards yields poor performance, combining the training data for both standards yields better results than previously achieved for each of them in isolation. This has immediate practical value for processing Norwegian, as it means that a single parsing pipeline is sufficient to cover both varieties, with no loss in accuracy. Based on the Norwegian Universal Dependencies treebank we present results for multiple taggers and parsers, experimenting with different ways of varying the training data given to the learners, including the use of machine translation.
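As a rough illustration of the "combined training data" setting described in the abstract, the sketch below concatenates the sentences of two CoNLL-U files into one training set. It is only a minimal sketch under stated assumptions: the helper names (`split_sentences`, `combine_training_data`, `toy_sentence`) and the two toy sentences are hypothetical and are not taken from the paper or from the Norwegian UD treebank, and the paper's actual data preparation may differ.

```python
from typing import List


def split_sentences(conllu_text: str) -> List[str]:
    """Split CoNLL-U text into sentence blocks, which are separated by blank lines."""
    blocks: List[str] = []
    current: List[str] = []
    for line in conllu_text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks


def combine_training_data(*treebanks: str) -> str:
    """Concatenate the sentences of several CoNLL-U treebanks into one training set."""
    sentences: List[str] = []
    for text in treebanks:
        sentences.extend(split_sentences(text))
    # Every CoNLL-U sentence, including the last, is followed by a blank line.
    return "\n\n".join(sentences) + "\n\n"


def toy_sentence(words: List[str]) -> str:
    """Build a minimal 10-column CoNLL-U block: ID, FORM, then eight '_' placeholders."""
    return "\n".join(
        "\t".join([str(i), word] + ["_"] * 8) for i, word in enumerate(words, start=1)
    )


# Toy stand-ins for the two training sets (hypothetical data, not from the treebank).
bokmaal = toy_sentence(["Jeg", "leser"]) + "\n\n"
nynorsk = toy_sentence(["Eg", "les"]) + "\n\n"

combined = combine_training_data(bokmaal, nynorsk)
print(len(split_sentences(combined)), "sentences in the combined training set")
```

A single tagger or parser could then be trained on `combined` rather than on each standard separately, which is the setting the abstract reports as giving the best results.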


Similar resources

Machine Aided Translation between the Two Norwegian Languages Norwegian-Bokmål and Norwegian-Nynorsk

The article describes essential parts of a prototype system for machine aided translation from Norwegian-Bokmål to Norwegian-Nynorsk. The central parts of the system are a bilingual word list, inflection paradigms, phrases, and routines to deal with compound words. There are also syntactic and semantic rules, but they can be considered as preliminary. The article also includes a simple compari...

The Norwegian Dependency Treebank

The Norwegian Dependency Treebank is a new syntactic treebank for Norwegian Bokmål and Nynorsk with manual syntactic and morphological annotation, developed at the National Library of Norway in collaboration with the University of Oslo. It is the first publicly available treebank for Norwegian. This paper presents the core principles behind the syntactic annotation and how these principles we...

Reuse of Free Resources in Machine Translation between Nynorsk and Bokmål

We describe the development of a two-way shallow-transfer machine translation system between Norwegian Nynorsk and Norwegian Bokmål built on the Apertium platform, using the Free and Open Source resources Norsk Ordbank and the Oslo–Bergen Constraint Grammar tagger. We detail the integration of these and other resources in the system along with the construction of the lexical and structural tran...

Building gold-standard treebanks for Norwegian

Språkbanken at the National Library of Norway is currently building up gold-standard Dependency Grammar treebanks for Norwegian Bokmål and Nynorsk. The treebanks are manually annotated for morphological features, syntactic functions and dependency relations. This paper explains the choice of texts and format of the treebanks, some key aspects of the morphological and syntactic annotation, and i...

IBM's Norwegian Grammar Project, 1988–1991

During the years 1988–1991, IBM Norway developed a broad-coverage grammar for Norwegian Bokmål as part of an international corporate effort to create writing tools for all platforms and for all major language communities where IBM had business at that time. The grammar was based on IBM’s own lexicon and morphology modules, and a key factor of the technology was the programming language PLNLP. The...

Publication date: 2017